Inside-Outside Estimation of a Lexicalized PCFG for German

Authors

  • Franz Beil
  • Glenn Carroll
  • Detlef Prescher
  • Stefan Riezler
  • Mats Rooth
Abstract

The paper describes an extensive experiment in inside-outside estimation of a lexicalized probabilistic context-free grammar for German verb-final clauses. Grammar and formalism features which make the experiment feasible are described. Successive models are evaluated on precision and recall of phrase markup.
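The abstract does not say how precision and recall of phrase markup are computed, so the following is only a minimal sketch under assumptions: phrases are represented as hypothetical `(label, start, end)` tuples over token offsets, and a predicted phrase counts as correct only if label and span both match a gold phrase. The function name and the toy data are illustrative and are not taken from the paper.

```python
from collections import Counter


def bracket_precision_recall(predicted, gold):
    """Labeled-bracket precision and recall (assumed scoring scheme).

    Each bracket is a (label, start, end) tuple. Brackets are compared as
    multisets, so a repeated bracket is credited at most as often as it
    occurs in the gold markup.
    """
    pred_counts = Counter(predicted)
    gold_counts = Counter(gold)
    # Multiset intersection keeps the minimum count of each shared bracket.
    matched = sum((pred_counts & gold_counts).values())
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy markup for one clause; spans are token offsets.
gold = [("NP", 0, 2), ("NP", 2, 4), ("VP", 4, 5)]
predicted = [("NP", 0, 2), ("NP", 2, 3), ("VP", 4, 5), ("PP", 3, 4)]

precision, recall = bracket_precision_recall(predicted, gold)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

In this toy case two of the four predicted phrases match the three gold phrases. Comparing such scores across successive models is one plausible way to read the evaluation the abstract mentions.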


Similar resources

Disambiguation of Morphological Structure using a PCFG

German has a productive morphology and allows the creation of complex words which are often highly ambiguous. This paper reports on the development of a head-lexicalized PCFG for the disambiguation of German morphological analyses. The grammar is trained on unlabeled data using the Inside-Outside algorithm. The parser achieves a precision of more than 68% on difficult test data, which is 23% mo...


Valence Induction with a Head-Lexicalized PCFG

Either directly or indirectly, the lexicon for a natural language specifies complementation frames or valences for open-class words such as verbs and nouns. Constructing a lexicon of complementation frames for large vocabularies constitutes a challenge of scale, with the further complication that frame usage, like vocabulary, varies with genre and undergoes ongoing innovation in a living langua...


Scalable Discriminative Parsing for German

Generative lexicalized parsing models, which are the mainstay for probabilistic parsing of English, do not perform as well when applied to languages with different language-specific properties such as free(r) word order or rich morphology. For German and other non-English languages, linguistically motivated complex treebank transformations have been shown to improve performance within the frame...


Spatial Random Trees and the Center-Surround Algorithm

A new class of multiscale stochastic processes called spatial random trees (SRTs) is introduced and studied. As with previous multiscale stochastic processes, SRTs model multidimensional signals using random processes on trees. Our key innovation, however, is that the tree structure itself is random and is generated by a probabilistic context-free grammar (PCFG) [26]. While PCFGs have been used...


Lexicalization in Crosslinguistic Probabilistic Parsing: The Case of French

This paper presents the first probabilistic parsing results for French, using the recently released French Treebank. We start with an unlexicalized PCFG as a baseline model, which is enriched to the level of Collins’ Model 2 by adding lexicalization and subcategorization. The lexicalized sister-head model and a bigram model are also tested, to deal with the flatness of the French Treebank. The ...



Journal:
  • CoRR

Volume: cs.CL/9905009


Publication date: 1999